FNP - Data Structures - CRDT Conflict-Free Merge Semantics
Summary (Explain Like I’m 5)
Imagine you and your friend are writing a story at the same time on different sheets of paper: Problem: You both add sentences. Later you combine papers. Which sentence comes first? What if you both deleted something? Solution - CRDT:- Each sentence gets a unique invisible timestamp (position identifier)
- When combining papers, the order is always the same, no matter who combines them
- No conflicts because positions never overlap
- Everyone ends up with the exact same story
Technical Deep Dive
CRDT (Conflict-free Replicated Data Type) is a data structure that guarantees:- Eventual Consistency → All replicas reach same state
- Commutativity → Order of operations doesn’t matter
- Idempotence → Applying operation twice = applying once
- Deterministic Merge → No conflicts, no user intervention needed
1. Commutativity Proof (Why order doesn’t matter)
For any two operations op₁ and op₂:2. Idempotence Guarantee
3. Deterministic Merge Algorithm
4. Tombstone Deletion (No erasure, just marking)
Mermaid Diagrams
Key Terms
- Commutativity → Operator order irrelevant; a + b = b + a
- Idempotence → f(f(x)) = f(x); applying twice = applying once
- Eventual Consistency → All replicas eventually reach same state
- Deterministic Merge → Same operation order → same result always
- Tombstone → Mark as deleted, don’t erase (LSEQ keeps history)
- Total Ordering → Unique positions create strict ordering (no ties)
- Causality → Happens-before relation defines operation order
- Conflict-Free → No user resolution needed; math handles it
Q/A
Q: Why can’t CRDT operations conflict? A: CRDT position IDs are unique (site_id + clock). No two operations occupy same position. Total ordering of positions means merge order is irrelevant—results identical regardless. Q: What if two users insert at same position simultaneously? A: LSEQ breaks ties with site_id (Alice vs Bob). Deterministic tiebreaker ensures all replicas rank them same way. One comes first (by site_id), no ambiguity. Q: Does Commutativity mean I can apply operations in any order? A: Yes for CRDT operations. Apply op₁ then op₂ = Apply op₂ then op₁. This enables replicas with different operation order to converge identically. Q: What’s the difference between Commutativity and Idempotence? A: Commutativity: op₁ ∘ op₂ = op₂ ∘ op₁ (order). Idempotence: op ∘ op = op (repetition). CRDT ensures both—order doesn’t matter AND duplicate applications don’t break things. Q: Why use tombstones for deletion instead of actually erasing? A: If Alice deletes, Bob hasn’t received delete yet. If actually erased, Bob’s replica diverges. Tombstones mark invisible but keep history. When Bob’s delete arrives, causality ensures consistent deletion. Q: Can CRDT handle concurrent deletes of same char? A: Yes. Both delete operations target same position_id. Both increment delete_timestamp. Merge sees two “delete” markers—same position marked deleted. Result: converged invisibility. Q: What breaks Commutativity? A: Non-CRDT operations break it. Example: “insert at position 5” (absolute position). If Alice inserts first, Bob’s position 5 is now different. Merge order matters—not commutative. CRDT fixes this.Example / Analogy
Collaborative Playlist Analogy: Without CRDT (conflicts):- Alice adds song “A” at position 1
- Bob adds song “B” at position 1
- When merging, which song is at position 1? Conflict!
- Manager must choose: “Alice wins” or “Bob wins”
- Alice adds “A” at position (1.0, Alice_ID)
- Bob adds “B” at position (1.0, Bob_ID)
- Merge compares: (1.0, Alice_ID) < (1.0, Bob_ID)?
- Yes! → Order is: [A, B]
- Even if Bob merges first, positions are absolute
- Result: All listeners see [A, B] ← no conflict!
Cross-References: LSEQ CRDT Identifiers, Protocol Flow, Distributed Systems Category: Data Structures | CRDT | Consensus | Distributed Algorithms Difficulty: Advanced ⭐⭐⭐⭐⭐ Updated: 2025-11-28